Skip to content

Conversation

@5herhom
Copy link

@5herhom 5herhom commented Oct 25, 2025

Detail: FLINK-38531
Before fixed, it can not pass the newly added case BinlogOffsetTest#testCompareToWithGtidSetAndSkipEventsAndSkipRows and BinlogSplitReaderTest#testRestoreFromCheckpointWithGtidSetAndSkippingEventsAndRows.
Before fixed, the situation of data loss can be reproduced under the the newly added case BinlogSplitReaderTest#testRestoreFromCheckpointWithGtidSetAndSkippingEventsAndRows.

@5herhom
Copy link
Author

5herhom commented Oct 25, 2025

Please help review this PR. Thank you. @lvyanquan

@5herhom 5herhom force-pushed the fix-loss-data branch 4 times, most recently from 866aff3 to 3764c6d Compare October 27, 2025 13:47
@lvyanquan lvyanquan self-assigned this Nov 3, 2025
@leonardBang leonardBang self-requested a review November 4, 2025 02:36
@leonardBang
Copy link
Contributor

@yuxiqian Would you like to help review this PR when you have time?

@lvyanquan
Copy link
Contributor

Hi @5herhom, Thank you for pointing out this issue. Since BinlogOffset is a critical class, could you keep the changes minimal to facilitate review and maintenance?

@5herhom
Copy link
Author

5herhom commented Dec 4, 2025

Hi @5herhom, Thank you for pointing out this issue. Since BinlogOffset is a critical class, could you keep the changes minimal to facilitate review and maintenance?
Ok,I will do that later.

… checkpoint positioned in the middle of a bulk DML operation.
@5herhom
Copy link
Author

5herhom commented Dec 4, 2025

Hi @5herhom, Thank you for pointing out this issue. Since BinlogOffset is a critical class, could you keep the changes minimal to facilitate review and maintenance?
@lvyanquan I have pushed a new commit. Please have a look again. Thank you.


/**
* In a bad case, it will skip the rest records whitch causes infinite wait for empty data. So
* it should has a timeout to avoid it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whitch => which.

Could you explain when will a 'bad case' happen?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain when will a 'bad case' happen?

This bad case occurred before this fix. If the code before this fix is used to run this case, the situation will occur.

return Long.compare(restartSkipEvents, targetRestartSkipEvents);
}
// The completed events are the same, so compare the row number ...
return Long.compare(this.getRestartSkipRows(), that.getRestartSkipRows());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is reasonable for me.

Could @yuxiqian or @ruanhang1993 double check for this too?

@lvyanquan lvyanquan added this to the V3.6.0 milestone Dec 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants